COMPASS: joint copy number and mutation phylogeny reconstruction from amplicon single-cell sequencing data

Reconstructing the history of somatic DNA alterations can help understand the evolution of a tumor and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, most existing phylogenetic methods for scDNAseq data are designed either for single nucleotide variants (SNVs) or for large copy number alterations (CNAs), or are not applicable to targeted sequencing. Here, we develop COMPASS, a computational method for inferring the joint phylogeny of SNVs and CNAs from targeted scDNAseq data. We evaluate COMPASS on simulated data and apply it to several datasets including a cohort of 123 patients with acute myeloid leukemia. COMPASS detected clonal CNAs that could be orthogonally validated with bulk data, in addition to subclonal ones that require single-cell resolution, some of which point toward convergent evolution.

2 The hyperparameters for tree priors p1-p4 have to be chosen empirically and seem to be tuned specifically for AML in the presentation. Since liquid tumors can be quite different from solid tumors, it is unknown whether extensive prior knowledge is required for the users to pick these parameters?
3 The analysis of the AML dataset yielded many results. It is hard to gauge what is novel both methodologically and biologically. a) What findings are unique to COMPASS that is not identifiable in other methods? b) Biologically, what novel insights the new analysis tell us about AML evolution? I found many of these findings are known earlier. It will be important to highlight the novelty. 4) Several points presented in the work is not accurate: a) The authors claimed that " BiTSC2 does not allow CNLOH events and might therefore falsely interpret them as copy number losses. In addition, it does not allow losses of the mutated allele and only allows a copy-number gain of the mutated allele when it occurs in the same node as the corresponding SNV.". This is inaccurate. BiTSC2 do account for loss of mutation. In fact, the phase indicator in the original paper's fig1d has shown that both copy gain and loss are allowed to occur on the mutated allele in BiTSC2. b) In addition, the authors claim that "BiTSC2 only uses the coverage at SNVs to detect CNVs, which might miss CNVs in regions without SNVs". BiTSC2 also considers genomic regions with only CNA events (Fig1B in the paper). BiTSC2 offers to jointly infer CN event in consecutive regions within one segment, when segmentation info is available. c) The abstract of the work is written as if COMPASS is the first method jointly modeling CNV and SNV. This is not true. In other words, the "gap in the field" is not properly described.
Reviewer #2 (Remarks to the Author): This paper introduces COMPASS, a tumor phylogeny inference method of SNVs and CNAs from single-cell panel sequencing data. The focus is on MissionBio Tapestri sequencing technology. CNA states include loss, gain and copy-neutral loss of heterozygosity and SNVs are phased with CNAs, i.e. allele-specific CNAs. The method is evaluated on real and simulated data. As this is the first method to simultaneously infer CNAs and SNVs in a phylogeny, I am positive about the contribution of the paper. However, I have several concern that I will discuss below.

Experimental evaluation
Simulations should be extended to assess: -MP3 similarity combines SNV and CNA accuracy. Please separate these out. There seems to be an imbalance in number of SNVs and CNAs --so the MP3 score might be dominated by SNVs.
-Can subclonal CNA events be inferred? -How well can the method infer integer copy-numbers beyond gain/loss? I have the following comments about the real data: -How did the competing methods (SCITE and BiTSC2) perform on real data? 2. Methods -My biggest concern is that a description of the used evolutionary model is missing. Can an SNV be introduced multiple times on the tree? Can an SNV be lost after introduction, if so, how many times. Can the same region be affected by multiple CNAs? Please make this clear.
Relatedly, Section B needs to be written more carefully. Several definitions are missing. In particular, the definition of an event is missing.
-My other concern is that the algorithm consists of many hardcoded values, including the doublet rate, several overdispersion parameters, the allele-specific ADO rate, etc. I would like to understand how these values were chosen, and if they need to be altered depending on the data?
One particular example is lambda, which was set to e^-70. Does it ever happen that the algorithm uses different allelic dropout rates? Why not just remove this? -I do not like the separation of CNV (which include gains and losses) and CNLOH, a copy-neutral loss of heterozygosity is a type of CNA in addition to copy-number gains and losses.
-Section "Likelihood for the number of reads" needs more motivation. First, it is unclear why there is a factor of 1/2 in E [D_{kj}]. Second, the number of reads of a region does not only depend on the copy number of the region itself but also the copy-number of other regions. The larger the copy numbers of other regions are, the smaller the number of reads of the region in question.
-Are rho_k values consistent across patients that use the same set of amplicons? This is another form of orthogonal validation. Alternatively, should these be inferred a priori using a cohort level analysis?
-Section A.2 describes a generative process, but the starting point is unclear. It is seems that you assume that a tree with N nodes with reference and variant copies for each locus has been drawn already. Please describe this more carefully.
-How is the SA algorithm initialized? That is, how are initial trees generated? -Lambda was set to e^-70. Does it ever happen that the algorithm uses different allelic dropout rates? Why not just remove this? -The plate diagram in Figure A.1 indicates that A_{ij} depends on D_{kj}. Should this be D_{ij} instead?

Data availability
As far as I can tell, The repository contains only one simulated instance and a handful of real data instanecs. Please upload all simulated and processed real data to facilitate reproducibility. The authors are extremely confident in their findings and the reply to the reviewers'comments. Even though target amplicon sequencing is a new development in the field, it is not yet a mainstream approach. The authors did make significant progress comparing to previous methods such as BiTSC2 on the amplicon sequencing data. However, the simulations were constructed in such a way that those settings are extremely friendly to their own method, while disfavoring previous approaches (e.g. BiTSC2). I feel this is not fair to the colleagues who developed those methods earlier and can mislead the readers. In addition, the authors highly tuned their method on the AML dataset, which makes it hard to know the generality of these performances on other real datasets.

RESPONSE TO REVIEWERS' COMMENTS
Reviewer #2 (Remarks to the Author): The revised manuscript has addressed my previous concerns.

RESPONSE TO REVIEWERS' COMMENTS
We are grateful to the reviewers for their careful reading of the revised manuscript and their thoughtful suggestions. We have addressed all reviewer comments and adapted our manuscript accordingly.
In the following, reviewer comments are in black font, our responses in blue.
Reviewer #1 (Remarks to the Author): The authors are extremely confident in their findings and the reply to the reviewers'comments. Even though target amplicon sequencing is a new development in the field, it is not yet a main-stream approach. The authors did make significant progress comparing to previous methods such as BiTSC2 on the amplicon sequencing data. However, the simulations were constructed in such a way that those settings are extremely friendly to their own method, while disfavoring previous approaches (e.g. BiTSC2). I feel this is not fair to the colleagues who developed those methods earlier and can mislead the readers. In addition, the authors highly tuned their method on the AML dataset, which makes it hard to know the generality of these performances on other real datasets.
The simulation settings were designed to be as close as possible to targeted single-cell DNAseq data, for which our method is specifically designed. We believe that targeted single-cell DNAseq will likely be used much more often in the future due to its cost efficiency. One important difference of targeted scDNAseq as compared to whole-genome scDNAseq is that the coverage is not uniform, while other methods like BITSC2 assume that the coverage is uniform. The non-uniformity of the coverage has a very strong impact on the performance of BITSC2 and with these settings, BITSC2 performed poorly ( Figure 2). When we applied BITSC2 to real targeted sequencing data, the trees generated by BITSC2 contained many false CNAs which were not detected in bulk data (Figure 3), so we believe that our simulations accurately reflect true targeted sequencing data. Since in our previous version we only focused on targeted sequencing data, it is true that the previous version of the manuscript give the misleading impression that BITSC2 performs poorly in all settings, even with a uniform coverage. In order to provide a fairer comparison and a more complete picture, we have now separated Figure 2 into two main simulations: one where the coverage is uniform, which acknowledges that BITSC2 performs well in this setting, and one with non-uniform coverage, which shows the advancement of COMPASS in this setting.
Concerning the applicability of our method to other datasets, we have already in the previous revision added two additional datasets. They are admittedly still from blood malignancies because to our knowledge no similar datasets from solid tumors have been published so far, but they show that the method works well for other datasets generated in different labs and with different amplicons.